Change the mapper output directory from $TMP/shards to $TMP/map_output #3960

Merged: 2 commits, Sep 19, 2019

@gitlw gitlw commented Sep 11, 2019

In #3959, the bulk loader crashes when trying to move a directory into itself under a new name:
/dgraph/tmp/shards/shard_0
/dgraph/tmp/shards/shard_0/shard_0

The bulk loader logic is:

  1. The mapper produces its output as
    .../tmp/shards/000
    .../tmp/shards/001

  2. Read the list of shards under .../tmp/shards/

  3. Create the reducer shards as
    .../tmp/shards/shard_0
    .../tmp/shards/shard_1

  4. Move the shards listed in step 2 into the reducer shards created in step 3

Although I cannot reproduce the problem, it seems that the creation of the reducer shard directory .../tmp/shards/shard_0 (step 3) and the listing of the mapper shards (step 2) were reordered, so the listing picked up shard_0 as if it were a mapper shard. Something similar is mentioned in etcd-io/etcd#6368.
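
To make the suspected failure concrete, here is a minimal sketch that recreates the directory layout in a scratch directory and attempts the same move. It is not the bulk loader's code: the helper name `moveIntoSelf` is hypothetical, and the sketch simply assumes that shard_0 has already appeared in the listing.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// moveIntoSelf (hypothetical helper) builds <tmp>/shards/shard_0 in a scratch
// directory and then tries to move shard_0 into itself, which is the move
// that the crash in #3959 points to.
func moveIntoSelf() (renameErr, setupErr error) {
	tmp, err := os.MkdirTemp("", "bulk")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(tmp)

	shard := filepath.Join(tmp, "shards", "shard_0")
	if err := os.MkdirAll(shard, 0o755); err != nil {
		return nil, err
	}

	// Assumption: shard_0 was returned by the listing of <tmp>/shards, so it
	// is treated as a mapper shard and moved into the reducer shard with the
	// same name.
	return os.Rename(shard, filepath.Join(shard, "shard_0")), nil
}

func main() {
	renameErr, setupErr := moveIntoSelf()
	if setupErr != nil {
		panic(setupErr)
	}
	fmt.Println("rename error:", renameErr)
}
```

The operating system rejects renaming a directory into its own subtree, so the rename returns an error instead of moving anything.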

This PR avoids that possibility by putting the mapper output into a separate directory,
.../tmp/map_output, so that the program works correctly even if the reordering happens.



@gitlw gitlw requested review from manishrjain and a team as code owners September 11, 2019 00:51

@pullrequest pullrequest bot left a comment

✅ A review job has been created and sent to the PullRequest network.


@gitlw you can click here to see the review status or cancel the code review job.


@pullrequest pullrequest bot left a comment

Thanks for the detailed context in the PR description. Having gone through it, the PR appears to work around the issue as described. The actual bug is still there, though, and could resurface if someone does something similar in the future, so I would suggest adding a comment noting that the root issue still exists but that you were unable to reproduce it.


Reviewed with ❤️ by PullRequest

Contributor

@ashish-goswami ashish-goswami left a comment

:lgtm:

Reviewable status: 0 of 3 files reviewed, all discussions resolved (waiting on @manishrjain)

@mangalaman93
Contributor

I have a question. Wouldn't it make more sense to run fsync after creating or moving directories/files? That will guarantee that operations are executed in order.

@gitlw
Author

gitlw commented Sep 11, 2019

@mangalaman93 I think fsync would help with other types of problems.
For instance, if we create a subdirectory, then list all children of the parent and cannot find the newly created child, fsync could potentially help.
But in this case fsync would not help, because we are seeing a child directory that should not show up.

Contributor

@martinmr martinmr left a comment

Reviewed 3 of 3 files at r1.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @gitlw and @manishrjain)


dgraph/cmd/bulk/mapper.go, line 92 at r1 (raw file):

	filename := filepath.Join(
		m.opt.TmpDir,
		"map_output",

Perhaps make this a constant.


dgraph/cmd/bulk/merge_shards.go, line 30 at r1 (raw file):

func mergeMapShardsIntoReduceShards(opt options) {
	mapShards := shardDirs(opt.TmpDir + "/map_output")

Use filepath.Join to make this logic more robust.


dgraph/cmd/bulk/reduce.go, line 48 at r1 (raw file):

func (r *reducer) run() error {
	dirs := shardDirs(r.opt.TmpDir + "/shards")

also use filepath.Join here

Contributor

@ashish-goswami ashish-goswami left a comment

Reviewable status: 0 of 3 files reviewed, 3 unresolved discussions (waiting on @manishrjain and @martinmr)


dgraph/cmd/bulk/mapper.go, line 92 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

Perhaps make this a constant.

Done.


dgraph/cmd/bulk/merge_shards.go, line 30 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

Use filepath.Join to make this logic more robust.

Done.


dgraph/cmd/bulk/reduce.go, line 48 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

also use filepath.Join here

Done.

Contributor

@manishrjain manishrjain left a comment

:lgtm:

Reviewed 3 of 3 files at r2.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @martinmr)

@ashish-goswami ashish-goswami merged commit cd0e208 into master Sep 19, 2019
@ashish-goswami ashish-goswami deleted the gitlw/change_mapper_dir branch September 19, 2019 08:13
5 participants